Representations of language in a model of visually grounded speech signal
Authors
Abstract
We present a visually grounded model of speech perception which projects spoken utterances and images to a joint semantic space. We use a multi-layer recurrent highway network to model the temporal nature of spoken speech, and show that it learns to extract both form and meaning-based linguistic knowledge from the input signal. We carry out an in-depth analysis of the representations used by different components of the trained model and show that encoding of semantic aspects tends to become richer as we go up the hierarchy of layers, whereas encoding of form-related aspects of the language input tends to initially increase and then plateau or decrease.
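As a rough illustration of what a joint semantic space allows, the sketch below ranks candidate images for one utterance by similarity between their vectors. It is not the model described in the abstract: the vectors are made-up toy values standing in for the outputs of trained speech and image encoders, and the use of cosine similarity as the matching score, along with the helper names `cosine` and `rank_images`, are assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_images(utterance_vec, image_vecs):
    """Return image indices ordered from most to least similar to the utterance."""
    scores = [cosine(utterance_vec, v) for v in image_vecs]
    return sorted(range(len(image_vecs)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (hypothetical values), all living in the same 3-dimensional space.
utterance = [0.9, 0.1, 0.0]
images = [
    [0.0, 1.0, 0.0],
    [1.0, 0.2, 0.0],
    [0.0, 0.0, 1.0],
]

ranking = rank_images(utterance, images)
print(ranking)
```

Because both modalities are mapped into the same space, retrieval reduces to a nearest-neighbour lookup; any other similarity measure could be substituted for the cosine score used here.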
Similar resources
Pragmatic Representations in Iranian High School English Textbooks
Owing to the growing interest in communicative, cultural and pragmatic aspects of second language learning in recent years, the present study investigated representations of pragmatic aspects of English as a foreign language in Iranian high school textbooks. Using Halliday’s (1978) and Searle’s (1976) models, different language functions and speech acts were specifically determined and...
Learning Visually Grounded Words and Syntax of Natural Spoken Language
Properties of the physical world have shaped human evolutionary design and given rise to physically grounded mental representations. These grounded representations provide the foundation for higher level cognitive processes including language. Most natural language processing machines to date lack grounding. This paper advocates the creation of physically grounded language learning machines as ...
Using functional magnetic resonance imaging (fMRI) to explore brain function: cortical representations of language critical areas
Pre-operative determination of the dominant hemisphere for speech and of speech-associated sensory and motor regions has been of great interest to neurological surgeons. This has been of utmost importance but difficult to achieve, requiring either invasive (Wada test) or non-invasive (brain mapping) methods. In the present study we have employed functional Magnetic Resonance Imaging...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also used in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...